Randomness in Competitions 
We study the effects of randomness on competitions based on an elementary random process
in which there is a finite probability that a weaker team upsets a stronger team. We apply this 
model to sports leagues and sports tournaments, and compare the theoretical results with empirical 
data. Our model shows that single-elimination tournaments are efficient but unfair: the number of 
games is proportional to the number of teams N, but the probability that the weakest team wins 
decays only algebraically with N. In contrast, leagues, where every team plays every other team, 
are fair but inefficient: the top $\sqrt{N}$ of teams remain in contention for the championship, while
the probability that the weakest team becomes champion is exponentially small. We also propose 
a gradual elimination schedule that consists of a preliminary round and a championship round. 
Initially, teams play a small number of preliminary games, and subsequently, a few teams qualify 
for the championship round. This algorithm is fair and efficient: the best team wins with a high 
probability and the number of games scales as $N^{9/5}$, whereas traditional leagues require $N^3$ games
to fairly determine a champion. 
I. INTRODUCTION

Competitions play an important role in society [1-3],
economics and politics. Furthermore, competitions
underlie biological evolution and are replete in ecology,
where species compete for food and resources. Sports
are an ideal laboratory for studying competitions [7-10].
In contrast with evolution, where records are incomplete,
the results of sports events are accurate, complete, and
widely available [11].

Randomness is inherent to competitions. The outcome 
of a single match is subject to a multitude of factors including
game location, weather, injuries, etc., in addition
to the inherent difference in the strengths of the oppo- 
nents. Just as the outcome of a single game is not pre- 
dictable, the outcome of a long series of games is also 
not completely certain. In this paper, we review a series 
of our studies that focus on the role of randomness in 
competitions [13-16]. Among the questions we ask are:
What is the likelihood that the strongest team wins a 
championship? What is the likelihood that the weakest 
team wins? How efficient are the common competition 
formats and how "accurate" is their outcome? 

We introduce an elementary model where a weaker 
team wins against a stronger team with a fixed upset 
probability q, and use this elementary random process 
to analyze a series of competitions [13]. To help cali- 
brate our model, we first determine the favorite and the 
underdog from the win-loss record over many years of 
sports competition from several major sports. We find 
that the distribution of win percentage approaches a uni- 
versal scaling function when the number of games and 
the number of teams are both large. We then simulate 
a realistic number of games and a realistic number of
teams, and demonstrate that our basic competition process
successfully captures the empirical distribution of
win percentage in professional baseball [14]. Moreover,
we study the empirical upset frequency and observe that 
this quantity differentiates professional sports leagues, 
and furthermore, illuminates the evolution of competi- 
tive balance. 

Next, we apply the competition model to single- 
elimination tournaments where, in each match, the win- 
ner advances to the next round and the loser is eliminated 
[15]. We use the very same competition rules where the
underdog wins with a fixed probability. Here, we intro- 
duce the notion of innate strength and assume that enter- 
ing the competition, the teams are ranked. We find that 
the typical rank of the winner decays algebraically with 
the size of the tournament. Moreover, the rank distribu- 
tion for the winner has a power-law tail. Hence, larger 
tournaments do produce stronger winners, but neverthe- 
less, even the weakest team may have a realistic chance of 
winning the entire tournament. Therefore, tournaments 
are efficient but unfair. 

Further, we study the league format, where every team 
plays every other team [16]. We note that the number of
wins for each team performs a biased random walk. Using
heuristic scaling arguments, we establish that the top
$\sqrt{N}$ teams have a realistic chance of becoming champion,
while it is highly unlikely that the weakest teams can 
win the championship. In addition, the total number of 
games required to guarantee that the best team wins is 
cubic in N. In this sense, leagues are fair but inefficient. 

Finally, we propose a gradual elimination algorithm as 
an efficient way to determine the champion. This hybrid 
algorithm utilizes a preliminary round where the teams 
play a small number of games and a small fraction of 
the teams advance to the next round. The number of
games in the preliminary round is large enough to ensure 
the stronger teams advance. In the championship round, 
each team plays every other team ample times to guaran- 
tee that the strongest team always wins. This algorithm 
yields a significant improvement in efficiency compared 
to a standard league schedule. 

The rest of this paper is organized as follows. In sec- 
tion II, the basic competition model is introduced and its 
predictions are compared with empirical standings data. 
The notion of innate team strength is incorporated in sec- 
tion III, where the random competition process is used to 
model single-elimination tournaments. Scaling laws for 
the league format are derived in section IV. Scaling con- 
cepts are further used to analyze the gradual elimination 
algorithm proposed in section V. Finally, basic features 
of our results are summarized in section VI. 



II. THE COMPETITION MODEL 

In our competition model, N teams participate in a 
series of games. Two teams compete head to head and, 
at the end of each match, one team is declared the winner 
and the other as the loser. There are no ties. 

To study the effect of randomness on competitions, we 
consider the scenario where there is a fixed upset probability
$q$ that a weaker team upsets a stronger team [3, 13].
This probability has the bounds $0 \le q \le 1/2$. The
lower bound corresponds to predictable games where the 
stronger team always wins, and the upper bound corre- 
sponds to random games. We consider the simplest case 
where the upset probability q does not change with time 
and is furthermore independent of the relative strengths 
of the competitors. 

In each game, we determine the stronger and the 
weaker team from current win-loss records. Let us con- 
sider a game between a team with k wins and a team 
with $j$ wins. The competition outcome is stochastic: if
$k > j$,

$$(k, j) \to \begin{cases} (k+1, j) & \text{with probability } p, \\ (k, j+1) & \text{with probability } q, \end{cases} \tag{1}$$

where $p + q = 1$. If $k = j$, the winner is chosen randomly.
Initially, all teams have zero wins and zero losses.

We use a kinetic framework to analyze the outcome
of this random process [17], taking advantage of the fact
that the number of games is a measure of time. We randomly
choose the two competing teams and update the
time by $t \to t + \Delta t$, with $\Delta t = 1/(2N)$, after each
competition. With this normalization, each team participates
in one competition per unit time.

Let $f_k(t)$ be the fraction of teams with $k$ wins at
time $t$. This probability distribution must be normalized,
$\sum_k f_k = 1$. In the limit $N \to \infty$, this distribution
evolves according to

$$\frac{df_k}{dt} = p\,(f_{k-1}F_{k-1} - f_k F_k) + q\,(f_{k-1}G_{k-1} - f_k G_k) + \frac{1}{2}\left(f_{k-1}^2 - f_k^2\right), \tag{2}$$



for $k \ge 0$. Here we also introduced two cumulative distribution
functions: $F_k = \sum_{j=0}^{k-1} f_j$ is the fraction of
teams with less than $k$ wins and $G_k = \sum_{j=k+1}^{\infty} f_j$ is
the fraction of teams with more than $k$ wins. Of course,
$F_k + G_{k-1} = 1$. The first two terms on the right-hand side
of (2) account for games in which the stronger team wins,
and the next two terms correspond to matches where the
weaker team wins. The last two terms account for games
between teams of equal strength (the numerical prefactor
is combinatorial). Accounting for the boundary condition
$f_{-1} = 0$ and summing the rate equations (2), we readily
verify that the normalization $\sum_k f_k = 1$ is preserved.
The initial conditions are $f_k(0) = \delta_{k,0}$.

In contrast to $f_k$, the cumulative distribution functions
obey closed evolution equations. In particular, the quantity
$F_k$ evolves according to

$$\frac{dF_k}{dt} = q\,(F_{k-1} - F_k) + \left(\frac{1}{2} - q\right)\left(F_{k-1}^2 - F_k^2\right), \tag{3}$$

which may be obtained by summing (2). The boundary
conditions are $F_0 = 0$ and $F_\infty = 1$, and the initial
condition is $F_k(0) = 1$ for $k > 0$. We note that the average
number of wins, $\langle k \rangle = t/2$, where $\langle k \rangle = \sum_k k f_k$,
follows from the fact that each team participates in one
competition per unit time and that one win is awarded
in each game. As $\langle k \rangle = \sum_k k\,(F_{k+1} - F_k)$, we can verify
that $d\langle k \rangle/dt = 1/2$ by summing the rate equations (3).
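The conservation law $\langle k \rangle = t/2$ can be checked numerically. The following is a minimal sketch, not the authors' code: it integrates the rate equation for $F_k$ with a simple forward Euler step. The function names, the step size `dt`, and the cutoff `k_max` are illustrative assumptions. The mean is evaluated as $\langle k \rangle = \sum_{k \ge 1} (1 - F_k)$, which follows from $f_k = F_{k+1} - F_k$.

```python
def integrate_rate_equation(q, t_end, dt=0.01):
    """Forward-Euler integration of dF_k/dt; returns the list F[k] at t_end.

    Assumptions (not from the paper): step size dt and cutoff k_max, chosen
    so that essentially no team reaches k_max wins by time t_end.
    """
    k_max = int(3 * t_end) + 20
    F = [0.0] + [1.0] * k_max          # F_0 = 0 and F_k(0) = 1 for k > 0
    for _ in range(round(t_end / dt)):
        new = F[:]
        for k in range(1, k_max + 1):
            drift = q * (F[k - 1] - F[k])
            drift += (0.5 - q) * (F[k - 1] ** 2 - F[k] ** 2)
            new[k] = F[k] + dt * drift
        F = new
    return F


def mean_wins(F):
    """Average number of wins, <k> = sum over k >= 1 of (1 - F_k)."""
    return sum(1.0 - Fk for Fk in F[1:])


if __name__ == "__main__":
    F = integrate_rate_equation(q=0.25, t_end=20.0)
    print(mean_wins(F))  # should be close to t/2 = 10
```

Plotting `F[k]` against `k / t_end` for increasing `t_end` is one way to watch the distribution sharpen toward its scaling form.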

We first discuss the asymptotic behavior when the
number of games is very large. In the limit $t \to \infty$,
we use the continuum approach and replace the difference
equations (3) with the partial differential equation

$$\frac{\partial F}{\partial t} + \left[q + (1 - 2q)F\right]\frac{\partial F}{\partial k} = 0. \tag{4}$$

According to our model, the weakest team wins at least
a fraction $q$ of its games, on average, and similarly, the
strongest team wins no more than a fraction $p$ of its
games. Hence, the number of wins is proportional to
time, $k \sim t$. We thus seek the scaling solution

$$F_k(t) \simeq \Phi\!\left(\frac{k}{t}\right). \tag{5}$$



Here and throughout this paper, the quantity $\Phi(x)$ is the
scaled cumulative distribution of win percentage; that is,
the fraction of teams that win less than a fraction $x$ of
games played. The boundary conditions are $\Phi(0) = 0$
and $\Phi(\infty) = 1$.

We now substitute the scaling form (5) into
(4), and find that the scaling function satisfies
$\Phi'\,[(x - q) - (1 - 2q)\Phi] = 0$, where prime denotes
derivative with respect to $x$. There are two solutions:
$\Phi = \text{constant}$ and the linear function
$\Phi = (x - q)/(1 - 2q)$. Therefore, the distribution
of win percentages is piecewise linear,

$$\Phi(x) = \begin{cases} 0 & 0 < x < q, \\ \dfrac{x - q}{p - q} & q < x < p, \\ 1 & p < x. \end{cases} \tag{6}$$

FIG. 1: The cumulative distribution $\Phi(x)$ versus win percentage
$x$ for $q = 1/4$ at times $t = 100$ and $t = 500$. Also shown
for reference is the limiting behavior (6).

As expected, there are no teams with win percentage less
than the upset probability $q$, and there are no teams with
win percentage greater than the complementary probability
$p$. Furthermore, one can verify that $\langle x \rangle = 1/2$. The
linear behavior in (6) indicates that the actual distribution
of win percentage becomes uniform, $\Phi' = 1/(p - q)$
for $q < x < p$, when the number of games is very large.

As shown in figure 1, direct numerical integration of
the rate equation (3) confirms the scaling behavior (5).
Moreover, as the number of games increases, the function
$\Phi(x)$ approaches the piecewise-linear function given by
equation (6). However, there is a diffusive boundary layer
near $x = q$ and $x = p$, whose width decreases as $t^{-1/2}$ in
the long-time limit [18].

Generally, the win percentage is a convenient measure 
of team strength. For example, Major League Baseball 
(MLB) in the United States, where teams play ~ 160 
games during the regular season, uses win percentage to 
rank teams. The fraction of games won is preferred over 
the number of wins because throughout the season there 
are small variations between the number of games played 
by various teams in the league. 

The piecewise-linear scaling function in (6) holds in
the asymptotic limits $N \to \infty$ and $t \to \infty$. To apply the
competition model (1), we must use a realistic number of
games and a realistic number of teams. To test whether
the competition model faithfully describes the win percentage
of actual sports leagues, we compared the results
of Monte Carlo simulations with historical data for a variety
of sports leagues. In this paper, we give one
representative example: Major League Baseball.

FIG. 2: The cumulative distribution $\Phi(x)$ versus win percentage
$x$ for: (i) Monte Carlo simulations of the competition
process (1) with $q_{\rm model} = 0.41$, and (ii) season-end standings
for Major League Baseball (MLB) over the past century
(1901-2005).

In our simulations, there are $N$ teams, each participating
in exactly $t$ games throughout the season. In each
match, two teams are selected at random, and the outcome
of the competition follows the stochastic rule (1):
with the upset probability $q$, the team with the lower win
percentage is victorious, but otherwise, the team with the
higher win percentage wins. At the start of the simulated
season, all teams have an identical record. We treated
the upset frequency as a free parameter and found that
the value $q_{\rm model} = 0.41$ best describes the historical data
for MLB ($N = 26$ and $t = 162$). As shown in figure 2,
the competition model faithfully captures the empirical
distribution of win percentages at the end of the season.
The latter distribution is calculated from all season-end
standings over the past century (1901-2005).
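The simulation procedure just described can be sketched in a few lines. This is an illustrative reconstruction, not the authors' code. It makes two simplifying assumptions: teams are paired at random, so each team plays $t$ games only on average, and a team with no games yet is assigned a win percentage of 0.5. The function name `simulate_season` and the `seed` argument are hypothetical.

```python
import random
import statistics


def simulate_season(n_teams, games_per_team, q, seed=0):
    """Simulate one season of the upset model; return final win fractions."""
    rng = random.Random(seed)
    wins = [0] * n_teams
    played = [0] * n_teams

    def pct(i):
        # Assumption: a team without a record yet counts as 0.5.
        return wins[i] / played[i] if played[i] else 0.5

    n_games = n_teams * games_per_team // 2
    for _ in range(n_games):
        a, b = rng.sample(range(n_teams), 2)
        if pct(a) == pct(b):
            winner = rng.choice((a, b))      # equal records: coin flip
        else:
            favorite, underdog = (a, b) if pct(a) > pct(b) else (b, a)
            winner = underdog if rng.random() < q else favorite
        wins[winner] += 1
        played[a] += 1
        played[b] += 1
    return [wins[i] / played[i] for i in range(n_teams) if played[i]]


if __name__ == "__main__":
    fractions = simulate_season(n_teams=26, games_per_team=162, q=0.41)
    print(statistics.pstdev(fractions))
```

Sorting the returned win fractions and plotting rank against value gives an empirical cumulative distribution that can be compared with season-end standings.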

In addition, we directly measured the actual upset frequency
$q_{\rm data}$ from the outcome of all 163,000 games
played over the past century. To calculate the upset frequency,
we chronologically ordered all games and recreated
the standings at any given day. Then we counted
the number of games in which the winner was lower in
the standings at the time of the current game. Game location
and the margin of victory were ignored. For MLB,
we find the value $q_{\rm data} = 0.44$, only slightly higher than
the model estimate $q_{\rm model} = 0.41$.
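The measurement described above amounts to a single pass over a chronological game log. The sketch below is one possible implementation under stated assumptions: games are given as `(winner, loser)` pairs of team indices, standings are compared by win percentage, and games in which either team has no record yet, or in which the two records are equal, are skipped. The text does not say how such games were treated, so that choice is hypothetical, as is the function name `upset_frequency`.

```python
def upset_frequency(games, n_teams):
    """Fraction of games won by the team lower in the standings.

    games: chronological list of (winner, loser) team indices.
    Assumption: games with no prior record or with equal records are skipped.
    """
    wins = [0] * n_teams
    played = [0] * n_teams
    upsets = 0
    ranked_games = 0
    for winner, loser in games:
        # Compare wins/played for both teams by exact cross-multiplication.
        lhs = wins[winner] * played[loser]
        rhs = wins[loser] * played[winner]
        if played[winner] and played[loser] and lhs != rhs:
            ranked_games += 1
            if lhs < rhs:
                upsets += 1
        wins[winner] += 1
        played[winner] += 1
        played[loser] += 1
    return upsets / ranked_games if ranked_games else float("nan")


if __name__ == "__main__":
    # Team 0 beats team 1 twice, then loses to it once.
    print(upset_frequency([(0, 1), (0, 1), (1, 0)], n_teams=2))
```

In the toy log, the first game is skipped because neither team has a record, the second is won by the favorite, and the third is an upset.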

The standard deviation in win percentage, $\sigma$, defined
by $\sigma^2 = \langle x^2 \rangle - \langle x \rangle^2$, is commonly used to quantify
parity of a sports league [20]. For example, in
baseball, where the win percentage typically varies between
0.400 and 0.600, the historical standard deviation
is $\sigma = 0.084$. From the cumulative distribution (6),
it straightforwardly follows that the standard deviation
varies linearly with the upset probability,

$$\sigma = \frac{1/2 - q}{\sqrt{3}}. \tag{7}$$

FIG. 3: The standard deviation $\sigma$ as a function of time $t$.
Shown are results of numerical integration of the rate equation
(3) with $q = 1/4$. Also shown for reference is the limiting
value $\sigma_\infty = 1/(4\sqrt{3})$.



There is an obvious relationship between the predictabil- 
ity of individual games and the competitive balance of a 
league: the more random the outcome of an individual 
game, the higher the degree of parity between teams in 
the league. 
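As a brief check of (7), here is a sketch that assumes only the asymptotic uniform distribution of win percentage on $q < x < p$ implied by (6). The first two moments are

$$\langle x \rangle = \frac{1}{p-q}\int_q^p x\,dx = \frac{p+q}{2} = \frac{1}{2}, \qquad \langle x^2 \rangle = \frac{1}{p-q}\int_q^p x^2\,dx = \frac{p^2 + pq + q^2}{3},$$

so that

$$\sigma^2 = \langle x^2 \rangle - \langle x \rangle^2 = \frac{(p-q)^2}{12}, \qquad \sigma = \frac{p-q}{2\sqrt{3}} = \frac{1/2 - q}{\sqrt{3}}.$$

For instance, $q = 1/4$ gives $\sigma = 1/(4\sqrt{3}) \approx 0.144$.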

The standard deviation is a convenient quantity be- 
cause it requires only year-end standings, which consist of 
only N data points per season. The upset frequency, on 
the other hand, requires the outcome of each game, and 
therefore involves a much larger number of data points, 
$Nt/2$ per season. Yet, as a measure for competitive balance,
the upset frequency has an advantage [14]. As seen
in figure 3, the quantity $\sigma$ consists of two contributions:
one due to the intrinsic nature of the game and one due
to the finite length of the season. For example, the large
standard deviation $\sigma = 0.21$ in the National Football
League (NFL) is in large part due to the extremely short
season, $t = 16$. Therefore, the upset frequency, which is
decoupled from the length of the season, provides a more
accurate measure of competitive balance [22-24].

The evolution of the upset frequency over time is truly
fascinating (figure 4). Although $q$ varies over a narrow
range, this quantity can differentiate the four sports
leagues. The historical data shows that MLB has consis- 
tently had the least predictable games, while NBA and 
NFL games have been the most predictable. The trends 
for q for these sports leagues are even more interest- 
ing. Certain sports leagues (MLB and to a larger extent, 
NFL) managed to increase competitiveness by changing 
competition formats, increasing the number of teams, 
having unbalanced schedules where stronger teams play 
more challenging opponents, or using a draft where the 
weakest team can first pick the most promising upcoming 
talent. 




[Plot: horizontal axis "year", 1900 to 2000]



FIG. 4: Evolution of the upset frequency $q$ with time. Shown
is data [27] for: (i) Major League Baseball (MLB), (ii) the
National Hockey League (NHL), (iii) the National Basketball
Association (NBA), and (iv) the National Football League
(NFL). The quantity $q$ is the cumulative upset frequency for
all games played in the league up to the given year. In football,
a tie counts as one half of a win.



In spite of the fact that NHL and NBA implemented 
some of these same measures to increase competitiveness, 
there are no clear long-term trends in the evolution of the 
upset probability in these two leagues. Another plausible
interpretation of figure 4 is that the sports leagues
are striving to achieve an optimal upset frequency of
$q \approx 0.4$. One may even speculate that the various sports
leagues compete against each other to attract public interest,
and that making the games less predictable, and
hence, more interesting to follow, is a key objective in
this evolutionary-like process. In any event,
the upset frequency is a natural and transparent mea- 
sure for the evolution of competitive balance in sports 
leagues. 

The random process (1) involves only a single parameter,
$q$. The model does not take into account many aspects
of real competitions including the game score, the
game location, the relative team strength, and the fact
that in many sports leagues the schedule is unbalanced,
as teams in the same geographical region may face each
other more often. Nevertheless, with appropriate implementation,
the competition model specified in equation
(1) captures basic characteristics of real sports leagues.
In particular, the model can be used to estimate the dis- 
tribution of team win percentages as well as the upset 
frequency. 



III. SINGLE ELIMINATION TOURNAMENTS 

Thus far, our approach did not include the notion of innate
team strength. Randomness alone controlled which
team reaches the top of the standings and which team
ends up at the bottom. Indeed, the probability that a
given team has the best record at the end of the season
equals $1/N$. Furthermore, we have used the cumulative
win-loss record to define team strength. However, this
definition cannot be used to describe tournaments where
the number of games is small.

We now focus on single-elimination tournaments,
where the winner of a game advances to the next round
of play while the loser is eliminated [15]. A single-elimination
tournament is the most efficient competition
format: a tournament with $N = 2^r$ teams requires only
$N - 1$ games through $r$ rounds of play to crown a champion.
In the first round, there are $N$ teams and the $N/2$
winners advance to the next round. Similarly, the second
round produces $N/4$ winners. In general, the number of
competitors is cut by half at each round,

$$N \to N/2 \to N/4 \to \cdots \to 1. \tag{8}$$



In many tournaments, for example, the NCAA college 
basketball tournament in the United States or in tennis 
championships, the competitors are ranked according to 
some predetermined measure of their strength. Thus, we 
introduce the notion of rank into our modeling framework.
Let $x_i$ be the rank of the $i$th team with

$$x_1 < x_2 < x_3 < \cdots < x_N. \tag{9}$$

In our definition, a team with lower rank is stronger.
Rank measures innate strength, and hence, it does not
change with time. Since ranking is strict, we use the uniform
ranking scheme $x_i = i/N$ without loss of generality.

Again, we assume that there is a fixed probability $q$
that the underdog wins the game, so that the outcome
of each match is stochastic. When a team with rank $x_1$
faces a team with rank $x_2$, we have

$$(x_1, x_2) \to \begin{cases} x_1 & \text{with probability } p, \\ x_2 & \text{with probability } q, \end{cases} \tag{10}$$

when $x_1 < x_2$. The important difference with (1) is that
the losing team is now eliminated.

Let $w_1(x)$ be the distribution of rank for all competitors.
This quantity is normalized, $\int_0^\infty dx\, w_1(x) = 1$. In a
two-team tournament, the rank distribution of the winner,
$w_2(x)$, is given by

$$w_2(x) = 2p\, w_1(x)\left[1 - W_1(x)\right] + 2q\, w_1(x)\, W_1(x), \tag{11}$$

where $W_1(x) = \int_0^x dy\, w_1(y)$ is the cumulative distribution
of rank. The structure of this equation resembles
that of (2), with the first term corresponding to games
where the favorite advances, and the second term to
games where the underdog advances. Mathematically,
there is a basic difference with Eq. (2) in that equation
(11) does not contain loss terms. Again, ties are not allowed
to occur. By integrating (11), we obtain the closed
equation $W_2(x) = 2p\,W_1(x) + (1 - 2p)\,[W_1(x)]^2$.

In general, the cumulative distribution obeys the nonlinear
recursion equation

$$W_{2N}(x) = 2p\, W_N(x) + (1 - 2p)\left[W_N(x)\right]^2. \tag{12}$$
FIG. 5: The cumulative distribution of rank. The quantity
$W_N(x)$ is calculated by iterating equation (12) with $q = 1/4$.

Here, $W_N(x) = \int_0^x dy\, w_N(y)$, and $w_N(x)$ is the rank distribution
for the winner of an $N$-team tournament. The
boundary conditions are $W_N(0) = 0$ and $W_N(\infty) = 1$.
The prefactor 2 arises because there are two ways to
choose the winner. The quadratic nature of equation (12)
reflects that two teams compete in each match (competitions
with three teams are described by cubic equations
[31-33]). Starting with $W_1(x) = x$ that corresponds to
uniform ranking, $w_1(x) = 1$, we can follow how the distribution
of rank evolves by iterating the recursion equation
(12). As shown in figure 5, the rank of the winner decreases
as the size of the tournament increases. Hence,
larger tournaments produce stronger winners.
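Iterating the recursion is straightforward to do numerically. The sketch below applies the map $W \to 2pW + (1-2p)W^2$ once per round, starting from uniform ranking $W_1(x) = x$ on $0 \le x \le 1$. The function name `winner_rank_cdf` is an illustrative choice, not taken from the paper.

```python
def winner_rank_cdf(x, rounds, q):
    """Cumulative rank distribution W_N(x) of the winner, N = 2**rounds.

    Starts from uniform ranking, W_1(x) = x for 0 <= x <= 1, and applies
    the quadratic recursion once per round of play.
    """
    p = 1.0 - q
    w = x
    for _ in range(rounds):
        w = 2.0 * p * w + (1.0 - 2.0 * p) * w * w
    return w


if __name__ == "__main__":
    q = 0.25
    for rounds in (0, 2, 4, 6):
        n_teams = 2 ** rounds
        # Probability that the champion comes from the top 10% of ranks.
        print(n_teams, winner_rank_cdf(0.1, rounds, q))
```

Two limits serve as sanity checks: for $q = 1/2$ the map is the identity, so the winner's rank stays uniform, and for $q = 0$ one finds $W_N(x) = 1 - (1-x)^N$, the distribution of the best of $N$ uniformly ranked teams.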

By substituting $W_1(x) = x$ into equation (12), we find
$W_2(x) \simeq (2p)\,x$ and in general, $W_N(x) \simeq (2p)^r x$ for small $x$. This
behavior suggests the scaling form

$$W_N(x) \simeq \Psi(x/x_*), \tag{13}$$

where the scaling factor $x_*$ is the typical rank of the
winner. This quantity decays algebraically with the size
of the tournament,

$$x_* = N^{-\beta}, \qquad \beta = \frac{\ln(2p)}{\ln 2}. \tag{14}$$



When games are perfectly random (upset probability
$q = 1/2$), the typical rank of the winner becomes independent
of the number of teams, $\beta(q = 1/2) = 0$. When
the games are highly predictable, the top teams tend to
win the tournament, $\beta(0) = 1$. Again, the scaling behavior
(14) shows that larger tournaments tend to produce
stronger champions.
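The exponent $\beta$ can be recovered in one line from the small-$x$ behavior. This is a sketch that identifies the typical rank $x_*$ with the scale at which $W_N(x)$ becomes of order one, and it holds only up to an order-one factor:

$$W_N(x) \simeq (2p)^r x \quad\Rightarrow\quad x_* \sim (2p)^{-r} = 2^{-r \ln(2p)/\ln 2} = N^{-\ln(2p)/\ln 2},$$

using $N = 2^r$. For example, with $q = 1/4$ ($p = 3/4$), $\beta = \ln(3/2)/\ln 2 \approx 0.585$, so a tournament with $N = 64$ teams has $x_* \sim (3/2)^{-6} \approx 0.088$.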

By substituting (13) into (12), we see that the scaling
function $\Psi(z)$ obeys the nonlocal and nonlinear equation

$$\Psi(2pz) = 2p\,\Psi(z) + (1 - 2p)\,\Psi^2(z). \tag{15}$$

The boundary conditions are $\Psi(0) = 0$ and $\Psi(\infty) = 1$.
From equation (15), we deduce the asymptotic behaviors

$$\Psi(z) \simeq \begin{cases} z & z \to 0, \\ 1 - C z^{\gamma} & z \to \infty, \end{cases} \tag{16}$$

with the scaling exponent $\gamma = \ln(2q)/\ln(2p)$, which is negative. The large-$z$ behavior
is obtained by substituting $\Psi(z) = 1 - U(z)$ into (15)
and noting that since $U \to 0$ when $z \to \infty$, the correction
obeys the linear equation $U(2pz) = 2q\,U(z)$.

The large-$z$ behavior of the scaling function $\Psi(z)$ gives
the likelihood that a very weak team manages to win the
entire tournament. The scaling behavior (13) is equivalent
to $w_N(x) \simeq (1/x_*)\,\psi(x/x_*)$ with $\psi(z) = \Psi'(z)$. In
the limit $z \to 0$, the distribution approaches a constant,
$\psi(z) \to 1$. However, the tail of the rank distribution is
algebraic,

$$\psi(z) \sim z^{-\alpha}, \qquad \alpha = 1 - \frac{\ln(2q)}{\ln(2p)}, \tag{17}$$

when $z \to \infty$. The exponent $\alpha > 1$ increases monotonically
with $p$, and it diverges in the limit $p \to 1$ [34].



Moreover, the probability that the weakest team wins
the tournament, $P_N = q^{\log_2 N}$, decays algebraically with the
total number of teams, $P_N = N^{\ln q/\ln 2}$. In the following
section, we discuss sports leagues and find that: (i) the 
rank distribution of the winner has an exponential tail, 
and (ii) the probability that the weakest team is crowned 
league champion is exponentially small. 
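The algebraic decay for the weakest team can be verified exactly for small tournaments. The sketch below enumerates a fixed bracket and computes each team's probability of winning it. The bracket order is an assumption, whereas the recursion in the text corresponds to random pairings, and the function name `champion_probabilities` is hypothetical. Because the weakest team is the underdog in every game it plays, its probability equals $q^r$ for any bracket.

```python
def champion_probabilities(ranks, q):
    """Exact win probabilities in a single-elimination bracket.

    ranks: distinct ranks in bracket order (lower rank = stronger team);
           the length must be a power of two. The bracket is an assumption.
    Returns a dict mapping rank -> probability of winning the bracket.
    """
    n = len(ranks)
    assert n >= 1 and n & (n - 1) == 0, "number of teams must be a power of 2"
    if n == 1:
        return {ranks[0]: 1.0}
    p = 1.0 - q
    left = champion_probabilities(ranks[: n // 2], q)
    right = champion_probabilities(ranks[n // 2:], q)
    out = {}
    for a, pa in left.items():
        # Team a must win its half, then beat whoever emerges from the other.
        out[a] = pa * sum(pb * (p if a < b else q) for b, pb in right.items())
    for b, pb in right.items():
        out[b] = pb * sum(pa * (p if b < a else q) for a, pa in left.items())
    return out


if __name__ == "__main__":
    probs = champion_probabilities(list(range(1, 17)), q=0.4)
    print(probs[16], 0.4 ** 4)  # weakest team versus q**r with r = 4
```

With $N = 16$ and $q = 0.4$, the weakest team wins with probability $0.4^4 = 0.0256$, a small but hardly negligible chance.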

The scaling behavior (13) indicates universal statistics
when the size of the tournament is sufficiently large.
Once rank is normalized by typical rank, the resulting
distribution does not depend on tournament size. Further,
the scaling law (14) and the power-law tail (17) reflect
that tournaments can produce major upsets. With
a relatively small number of upset wins, a "Cinderella" 
team can emerge, and for this reason, tournaments can be 
very exciting. Furthermore, tournaments are maximally 
efficient as they require a minimal number of games to 
decide a champion. 

Figure 6 shows that our theoretical model nicely describes
empirical data [27] for the NCAA college basketball
tournament in the United States [15]. In the current
format, 64 teams participate in four sub-tournaments,
each with $N = 16$ teams. The winners of the four
sub-tournaments advance to the final four, which ultimately
decides the champion. Prior to the tournament, a com- 
mittee of experts ranks the teams from 1 to 16. We note 
that the game schedule is not random, and is designed 
such that the top teams advance if there are no upsets. 

Consistent with our theoretical results, the NCAA 
tournament has been producing major upsets: the 11th 
seed team has advanced to the final four twice over the 
past 30 years. Moreover, only once did all of the four top- 
seeded teams advance simultaneously (2008). Our model 
estimates the probability of this event at 1/190, a figure 
that is of the same order of magnitude as the observed 
frequency 1/132. 
FIG. 6: The cumulative distribution of rank for the NCAA
college basketball tournament. Shown is the cumulative distribution
$W_{16}(x)$ versus the rank $x$ for (i) NCAA tournament
data (1979-2006), (ii) iteration of the equation (12).



We also mention that in producing the theoretical
curve in figure 6, we used the upset frequency
$q_{\rm model} = 0.18$, whereas the actual game results yield
$q_{\rm data} = 0.28$. This larger discrepancy (compared with
the MLB analysis above) is due to a number of factors
including the much smaller dataset (~ 7000 games) and
the non-random game schedule. Indeed, our Monte Carlo
simulations which incorporate a realistic schedule give
better estimates for the upset frequency [15].



IV. LEAGUES 

We now discuss the common competition format in 
which each team hosts every other team exactly once 
during the season. This format, first used in English soc- 
cer, has been adopted in many sports. In a league of size 
N, each team plays 2(N—1) games and the total number 
of games equals N(N — 1). Given this large number of 
games, does the strongest team always win the championship?

To answer this question, we assume that each team 
has an innate strength and rank the teams according to 
strength. Without loss of generality, we use the uniform 
rank distribution $w(x) = 1$ and its cumulative counterpart
$W(x) = x$ where $0 \le x \le 1$. Moreover, we implicitly
take the large-$N$ limit. Consider a team with rank $x$. The
probability $v(x)$ that this team wins a game against a
randomly-chosen opponent decreases linearly with rank,

$$v(x) = p - (2p - 1)\,x, \tag{18}$$

as follows from $v(x) = p\,[1 - W_1(x)] + q\,W_1(x)$ [see also
equation (11)]. Consistent with our competition rules (1)
and (10), the probability $v(x)$ satisfies $q \le v \le p$.

Since team strength does not change with time, the 
average number of wins $V(x,t)$ for a team with rank $x$
grows linearly with the number of games $t$,

$$V(x,t) = v(x)\,t. \tag{19}$$

Accordingly, the number of wins of a given team performs
a biased random walk: after each game the number of
wins increases by one with probability $v$, and remains
unchanged with the complementary probability $1 - v$.
Also, the uncertainty in the number of wins, $\Delta V$, grows
diffusively with $t$,

$$\Delta V(x,t) \simeq \sqrt{D t}, \tag{20}$$

with diffusion coefficient $D = v(1 - v)$ [17].

Let us assume that each team plays t games. If the 
number of games is sufficiently large, the best team has 
the most wins. However, at intermediate times, it is pos- 
sible that a weaker team has the most wins. For a team 
with rank $x_*$ to still be in contention at time $t$, the
difference between its expected number of wins and that
of the top team should be comparable with the diffusive
uncertainty,

$$V(0,t) - V(x_*,t) \sim \Delta V(0,t). \tag{21}$$

We now substitute equations (18)-(20) into this heuristic
estimate and obtain the typical rank of the leader as a
function of time,

$$x_* \sim \frac{1}{\sqrt{t}}. \tag{22}$$

In obtaining this estimate, we tacitly ignored numeric 
prefactors, including in particular, the dependence on q. 

This crude estimate (22) shows that the best team does
not always win the league championship. Since $t \sim N$,
we have

$$x_* \sim \frac{1}{\sqrt{N}}. \tag{23}$$

Since rank is a normalized quantity, the top $\sqrt{N}$ of the
teams have a realistic chance of emerging with the best 
record at the end of the season. Thus randomness plays 
a crucial role in determining the champion: since the re- 
sult of an individual game is subject to randomness, the 
outcome of a long series of games reflects this random- 
ness. 

We can also obtain the total number of games $T$ needed
for the best team to always emerge as the champion,

$$T \sim N^3. \tag{24}$$

This scaling behavior follows by replacing $x_*$ in (22) with
$1/N$, which corresponds to the best team. For the best
team to win, each team must play every other team $O(N)$
times! Alternatively, the number of games played by each
team scales quadratically with the size of the league.
Clearly, such a schedule is prohibitively long, and we con- 
clude that the traditional schedule of playing each oppo- 
nent with equal frequency is neither efficient nor does it 
guarantee the best champion. 
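The estimates (22)-(24) follow from a short calculation. As a sketch that keeps the $q$-dependent prefactor dropped in the text (and that should be trusted only up to order-one factors, since the argument is heuristic), insert (18)-(20) into (21):

$$(2p - 1)\,x_*\,t \sim \sqrt{pq\,t} \quad\Rightarrow\quad x_* \sim \frac{\sqrt{pq}}{2p - 1}\,\frac{1}{\sqrt{t}},$$

where $D = v(0)\,[1 - v(0)] = pq$ for the top team. Requiring $x_* \sim 1/N$ gives $t \sim N^2$ games per team, and since each game involves two teams, the total number of games is

$$T = \frac{N t}{2} \sim N^3.$$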



[Plot: $T$ versus $N$ on logarithmic axes; legend: line of slope 3, simulation data points]



FIG. 7: The total number of games T needed for the best team 
to emerge as champion in a league of size N . The simulation 
results represent an average over $10^3$ simulated sports leagues.
Also shown for reference is the theoretical prediction. 



We confirmed the scaling law (24) numerically. In our
Monte Carlo simulations, the teams are ranked from 1
to $N$ at the start of the season. We implemented the
traditional league format where every team plays every
other team and kept track of the leader defined as the
team with the best record. We then measured the last-passage
time [35], that is, the time in which the best
team takes the lead for good. We define the average of
this fluctuating quantity as $T$ [36, 37]. As shown in figure 7,
the total number of games required is cubic.

Again, we expect that the probability distribution
$w(x, t)$ that a team with rank $x$ has the best record after
$t$ games is characterized by the scale $x_*$ given in (22),

$$w(x,t) \simeq (1/x_*)\,\psi(x/x_*). \tag{25}$$

Numerical results confirm this scaling behavior [16].
Since the number of wins performs a biased random walk,
we expect that the distribution of the number of wins
becomes normal in the long-time limit. Moreover, the
scaling function in (25) has a Gaussian tail [16],

$$\psi(z) \sim \exp\left(-\text{const.} \times z^2\right) \tag{26}$$

as $z \to \infty$.

Using this scaling behavior, we can readily estimate
the probability that the worst team becomes champion (in
the standard league format). For the worst team, $x \sim 1$,
and the corresponding scaling variable in equation (25)
is $z \sim \sqrt{N}$. Hence, the Gaussian tail (26) shows that the
probability $P_N$ that the weakest team wins the league is
exponentially small,

    $P_N \sim \exp(-\mathrm{const} \times N)$.    (27)
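The step from (26) to (27) can be spelled out as follows. This is only a sketch: it assumes that the scale in (22) behaves as $x_* \sim 1/\sqrt{t}$, that each team in the standard league plays $t \sim N$ games, and it drops algebraic prefactors.

```latex
\begin{align*}
x_* &\sim t^{-1/2} \sim N^{-1/2}, \\
z &= \frac{x}{x_*} \sim \frac{1}{N^{-1/2}} = N^{1/2}
  \qquad (x \sim 1 \text{ for the worst team}), \\
P_N &\sim \psi(z) \sim \exp\left(-\mathrm{const}\times z^2\right)
  = \exp\left(-\mathrm{const}\times N\right).
\end{align*}
```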



In sharp contrast with tournaments, where this probability
is algebraic, leagues do not produce upset champions.
Leagues may not guarantee the absolute top team
as champion, but nevertheless, they do produce worthy
champions.

FIG. 8: Leagues versus tournaments. Shown is $P_n$, the
probability that the nth-ranked team has the best record at the end
of the season in the format of playing all opponents with equal
frequency, and the probability that the nth-ranked team wins
an N-team single-elimination tournament. The upset
probability is q = 0.4 and N = 16.

To compare leagues and tournaments, we calculated
the probability $P_n$ that the nth-ranked team is champion
for a realistic number of teams N = 16 and a realistic
upset probability q = 0.4 (figure 8). For leagues, we
calculated this probability from Monte Carlo simulations,
and for tournaments, we used equation (12). Indeed, the
top four teams fare better in a league format while the
rest of the teams are better off in a tournament. This
behavior is fully consistent with the above estimate that
the top $\sqrt{N}$ teams have a realistic chance to win the
league.
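A comparison of this kind can be reproduced qualitatively with a short Monte Carlo sketch. Several choices below are assumptions rather than details taken from the text: the league is a single round-robin, ties for the best record are broken at random, the tournament bracket is seeded at random, the number of teams is a power of two, and both formats are simulated (the paper uses equation (12) for tournaments). Ranks are 0-indexed, with 0 the strongest team, and all function names are hypothetical.

```python
import itertools
import random


def league_champion(n_teams, q, rng):
    """Team with the best record after one round-robin (random tie-break)."""
    wins = [0] * n_teams
    for strong, weak in itertools.combinations(range(n_teams), 2):
        wins[weak if rng.random() < q else strong] += 1
    best = max(wins)
    return rng.choice([team for team, w in enumerate(wins) if w == best])


def tournament_champion(n_teams, q, rng):
    """Winner of a randomly seeded single-elimination bracket (n_teams = 2^m)."""
    field = list(range(n_teams))
    rng.shuffle(field)
    while len(field) > 1:
        survivors = []
        for a, b in zip(field[0::2], field[1::2]):
            strong, weak = min(a, b), max(a, b)
            survivors.append(weak if rng.random() < q else strong)
        field = survivors
    return field[0]


def champion_frequencies(champion_fn, n_teams, q, trials, seed=0):
    """Empirical probability that each rank ends up as champion."""
    rng = random.Random(seed)
    counts = [0] * n_teams
    for _ in range(trials):
        counts[champion_fn(n_teams, q, rng)] += 1
    return [c / trials for c in counts]
```

Calling `champion_frequencies(league_champion, 16, 0.4, trials)` and `champion_frequencies(tournament_champion, 16, 0.4, trials)` gives two rank distributions that can be plotted against each other. The weakest team must win every one of its bracket games by upset, so its tournament probability is $q^{\log_2 N}$, far larger than its chance of finishing a league with the best record.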

What is the probability $P_{top}$ that the top team ends the
season with the best record in a realistic sports league?
To answer this question, we investigated the four major
sports leagues in the US: MLB, NHL, NFL, and NBA. We
simulated a league with the actual number of teams N
and the actual number of games t, using the empirical
upset frequencies (see figure 3). All of these sports leagues
have a comparable number of teams, $N \approx 25$. Surprisingly,
we find almost identical probabilities for three of
the sports leagues: (i) MLB with the longest season and
most random games (t = 162, q = 0.44) has $P_{top}$ = 0.31,
(ii) NFL with the shortest season but most deterministic
games (t = 16, q = 0.37) has $P_{top}$ = 0.30, and (iii) NHL
with intermediate season and intermediate randomness
(t = 80, q = 0.41) has $P_{top}$ = 0.32. Standing out as an
anomaly is the value $P_{top}$ = 0.45 for the NBA, which has
a moderate-length season but less random games (t = 80
and q = 0.37).

This interesting result reinforces our previous comments
about sports leagues competing against each other
for interest and our hypothesis that there are optimal
randomness parameters. Having a powerhouse win every
year does not serve the league well, but having the
strongest team finish with the best record once every
three years may be optimal.



V. GRADUAL ELIMINATION ALGORITHM 

Our analysis demonstrates that single-elimination 
tournaments have optimal efficiency but may produce 
weak champions, whereas leagues which result in strong 
winners are highly inefficient. Can we devise a com- 
petition "algorithm" that guarantees a strong champion 
within a minimal number of games? 

As an efficient algorithm, we propose a hybrid schedule
consisting of a preliminary round and a championship
round [16]. The preliminary round is designed
to weed out a majority of teams using a minimal number
of games, while the championship round includes ample
games to guarantee the best team wins.

In the preliminary round, every team competes in t
games. Whereas the league schedule has complete graph
structure with every team playing every other team, the
preliminary round schedule has regular random graph
structure with each team playing against the same number
of randomly chosen opponents. Out of the N teams,
the M teams with the largest number of wins in the
preliminary round advance to the championship round.
The number of games t is chosen such that the strongest
team always qualifies. By the same heuristic argument
(21) leading to (22), the top team ranks no lower than
$1/\sqrt{t}$ after t games. We thus require

    $M/N \sim 1/\sqrt{t}$,    (28)



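and consequently, each team plays $t \sim (N/M)^2$ preliminary
games. The championship round uses a league
format with each of the M qualifying teams playing M
games against every other team. Therefore, the total
number of games, T, has two components: the N teams
play $\sim N (N/M)^2$ preliminary games in total, and the
$\sim M^2$ pairs of qualifying teams play M games each,

    $T \sim N^3/M^2 + M^3$.    (29)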



In writing this estimate, we ignore numeric prefactors,
as well as the dependence on the upset frequency q. The
quantity T is minimal when the two terms in (29) are
comparable [38]. Hence, the size of the championship
round $M_1$ and the total number of games $T_1$ scale
algebraically with N,

    $M_1 \sim N^{3/5}$, and $T_1 \sim N^{9/5}$.    (30)



Consequently, each team plays $O(N^{4/5})$ games in the
preliminary round. Interestingly, the existence of a
preliminary round significantly reduces the number of games
from $N^3$ to $N^{9/5}$. Without sacrificing the quality of the
champion, the hybrid schedule yields a huge improvement
in efficiency!
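The exponents in (30) follow from a short minimization of (29). The worked steps below are a sketch that, like the text, ignores numerical prefactors and the dependence on q.

```latex
\begin{align*}
T(M) &\sim \frac{N^3}{M^2} + M^3, \qquad
\frac{dT}{dM} \sim -\frac{2N^3}{M^3} + 3M^2 = 0
  \;\Longrightarrow\; M_1^5 \sim N^3
  \;\Longrightarrow\; M_1 \sim N^{3/5}, \\
T_1 &\sim \frac{N^3}{M_1^2} + M_1^3 \sim N^{3-6/5} + N^{9/5} \sim N^{9/5}, \\
t &\sim \left(\frac{N}{M_1}\right)^2 \sim N^{4/5}.
\end{align*}
```

For illustration, with $N = 10^3$ these scalings give $M_1 \sim 10^{1.8} \approx 63$, $t \sim 10^{2.4} \approx 250$ and $T_1 \sim 10^{5.4} \approx 2.5 \times 10^5$, compared with $N^3 = 10^9$ for the standard league, again up to prefactors.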

We can further improve the efficiency by using multiple
elimination rounds. In this generalization, there are
$k$ consecutive rounds of preliminary play culminating
in the championship round. The underlying graphical
structure of the preliminary rounds is always a regular
random graph, while the championship round remains a
complete graph. Each preliminary round is designed to
advance the top teams, and the number of games is
sufficiently large so that the top team advances with very
high probability. When there are k rounds, we anticipate
the scaling laws

    $M_k \sim N^{\nu_k}$ and $T_k \sim N^{\mu_k}$,    (31)

where $M_k$ is the number of teams advancing out of the
first round and $T_k$ is the total number of games. Of
course, when there are no preliminary rounds, $\nu_0 = 1$
and $\mu_0 = 3$. Following equation (31), the number of
teams gradually declines in each round,

    $N \to N^{\nu_k} \to N^{\nu_k \nu_{k-1}} \to \cdots \to N^{\nu_k \nu_{k-1} \cdots \nu_1} \to 1$.    (32)

    k          0      1       2       3        4        ∞
    $\nu_k$    1     3/5    15/19   57/65   195/211     1
    $\mu_k$    3     9/5    27/19   81/65   243/211     1

TABLE I: The exponents $\nu_k$ and $\mu_k$ in equation (31) for $k \le 4$.



According to the first term in (29), the number of
games in the first round scales as $N^3/M_k^2 \sim N^{3-2\nu_k}$, and
therefore, the total number of games obeys the recursion

    $T_k(N) \sim N^{3-2\nu_k} + T_{k-1}(N^{\nu_k})$.    (33)



Indeed, if we replace $M_1$ with $N^{\nu_1}$ in equation (29), we
can recognize the recursion (33). The second term scales
as $N^{\nu_k \mu_{k-1}}$ and becomes comparable to the first when
$3 - 2\nu_k = \nu_k \mu_{k-1}$. Hence, the scaling exponents satisfy
the recursion relations

    $\nu_k = 3/(2 + \mu_{k-1})$ and $\mu_k = \mu_{k-1} \nu_k$.    (34)



Using $\nu_0 = 1$ and $\mu_0 = 3$, we recover $\nu_1 = 3/5$ and
$\mu_1 = 9/5$ in agreement with (30). The general solution
of (34) is

    $\nu_k = \frac{1 - (2/3)^k}{1 - (2/3)^{k+1}}$,    $\mu_k = \frac{1}{1 - (2/3)^{k+1}}$.    (35)



Hence, the efficiency is optimal, and the number of games
becomes linear in the limit $k \to \infty$. For a modest number
of teams, a small number of preliminary rounds, say 1-3
rounds, may suffice. Indeed, with as few as four elimination
rounds, the number of games becomes essentially
linear, $\mu_4 = 243/211 \approx 1.15$.

Interestingly, the result $\mu_\infty = 1$ indicates that
championship rounds or "playoffs" have the optimal size $M_*$
given by

    $M_* \sim N^{1/3}$.    (36)
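The recursion (34), its closed form (35), and the playoff size (36) can be checked with exact rational arithmetic. The sketch below uses hypothetical function names. It also relies on one step not written out in the text: since $\mu_k = \mu_{k-1}\nu_k$, the exponent of the final round in (32) is $\nu_k \cdots \nu_1 = \mu_k/\mu_0 = \mu_k/3$, which tends to 1/3 as $\mu_k \to 1$.

```python
from fractions import Fraction


def exponents(k_max):
    """Iterate recursion (34) from nu_0 = 1, mu_0 = 3; return lists nu, mu."""
    nu, mu = [Fraction(1)], [Fraction(3)]
    for _ in range(k_max):
        nu_k = Fraction(3) / (2 + mu[-1])   # nu_k = 3 / (2 + mu_{k-1})
        nu.append(nu_k)
        mu.append(mu[-1] * nu_k)            # mu_k = mu_{k-1} * nu_k
    return nu, mu


def closed_form(k):
    """Closed form (35); the nu_k expression is meant for k >= 1."""
    r = Fraction(2, 3)
    return (1 - r**k) / (1 - r**(k + 1)), 1 / (1 - r**(k + 1))


def final_round_exponent(k):
    """Exponent of N for the championship-round size: nu_1 * ... * nu_k."""
    nu, _ = exponents(k)
    product = Fraction(1)
    for value in nu[1:]:
        product *= value
    return product
```

For example, `exponents(4)` reproduces the entries 3/5, 15/19, 57/65, 195/211 and 9/5, 27/19, 81/65, 243/211, and `final_round_exponent(k)` equals `mu[k] / 3` for every k, approaching 1/3 as k grows.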



Gradual elimination is often used in the arts and sciences 
to decide winners of design competitions, grant awards, 
and prizes. Indeed, the selection process for prestigious 
prizes typically begins with a quick glance at all nom- 
inees to eliminate obviously weak candidates, but con- 
cludes with rigorous deliberations to select the winner. 
Multiple elimination rounds may be used when the pool 
of candidates is very large. 

To verify numerically the scaling laws (30), we simulated
a single preliminary round followed by a championship
round. We chose the size of the preliminary round
strictly according to (31) and used a championship round
where each of the $M_1$ teams plays every other team exactly
$M_1$ times. We confirmed that as the number of teams
increases from $N = 10^1$ to $10^2$ to $10^3$, etc., the probability
that the best team emerges as champion is not only high
but also independent of N. We also confirmed that the
concept of preliminary rounds is useful for small N. For
N = 10 teams, the number of games can be reduced by
a factor > 10 by using a single preliminary round.
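A single preliminary round followed by a championship round can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the regular random graph is approximated by t rounds of random pairings (so repeat opponents are possible), ties in the standings are broken at random, all prefactors are set to one, and every function name is hypothetical. Teams are labelled 0 to N-1 with 0 the strongest.

```python
import itertools
import random


def play(a, b, q, rng):
    """Winner of one game; the weaker (higher-index) team wins with probability q."""
    strong, weak = min(a, b), max(a, b)
    return weak if rng.random() < q else strong


def scaling_sizes(n_teams):
    """M ~ N^(3/5) and t ~ (N/M)^2 with unit prefactors (an assumption)."""
    m = max(2, round(n_teams ** 0.6))
    t = max(1, round((n_teams / m) ** 2))
    return m, t


def preliminary_round(teams, t_games, n_qualify, q, rng):
    """Each team plays t_games against random opponents; the top n_qualify advance."""
    wins = {team: 0 for team in teams}
    order = list(teams)
    for _ in range(t_games):
        rng.shuffle(order)                      # random pairing for this round
        for a, b in zip(order[0::2], order[1::2]):
            wins[play(a, b, q, rng)] += 1
    ranked = sorted(teams, key=lambda team: (-wins[team], rng.random()))
    return ranked[:n_qualify]


def championship_round(teams, games_per_pair, q, rng):
    """League format: every pair plays games_per_pair times; best record wins."""
    wins = {team: 0 for team in teams}
    for a, b in itertools.combinations(teams, 2):
        for _ in range(games_per_pair):
            wins[play(a, b, q, rng)] += 1
    best = max(wins.values())
    return rng.choice([team for team in teams if wins[team] == best])


def gradual_elimination(n_teams, n_qualify, t_games, q, rng):
    """Champion of the hybrid schedule; the M qualifiers play M games per pair."""
    qualifiers = preliminary_round(list(range(n_teams)), t_games, n_qualify, q, rng)
    return championship_round(qualifiers, len(qualifiers), q, rng)
```

With `m, t = scaling_sizes(n)`, repeating `gradual_elimination(n, m, t, q, rng)` many times and counting how often team 0 is returned estimates the probability that the best team emerges as champion, which can then be tracked as n grows.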



VI. DISCUSSION 

We introduced an elementary competition model in 
which a weaker team can upset a stronger team with 
fixed probability. The model includes a single control pa- 
rameter, the upset frequency, a quantity that can be mea- 
sured directly from historical game results. This idealized 
competition model can be conveniently applied to a va- 
riety of competition formats including tournaments and 
leagues. The random competition process is amenable to 
theoretical analysis and is straightforward to implement 
in numerical simulations. Qualitatively, this model ex- 
plains how tournaments, which require a small number 
of games, can produce major upsets, and how leagues 
which require a large number of games always produce 
quality champions. Additionally, the random competi- 
tion process enables us to quantify these intuitive fea- 
tures: the rank distribution of the champion is algebraic 
in the former schedule but Gaussian in the latter. 

Using our theoretical framework, we also suggested an 
efficient algorithm where the teams are gradually elimi- 
nated following a series of preliminary rounds. In each 
preliminary round, the number of games is sufficient to
guarantee that the best team qualifies for the next round.
The final championship round is held in a league for- 
mat in which every team plays many games against every 
other team to guarantee that the strongest team emerges 
as champion. Using gradual elimination, it is possible to 
choose the champion using a number of games that is 
proportional to the total number of teams. Interestingly, 
the optimal size of the championship round scales as the 
one third power of the total number of teams. 

The upset frequency plays a major role in our model.
Our empirical studies show that the frequency of upsets,
which shows interesting evolutionary trends, is effective
in differentiating sports leagues. Moreover, this quantity
has the advantage that it is not coupled to the length of
the season, which varies widely from one sport to another.
Nevertheless, our approach makes a very significant
assumption: that the upset frequency is fixed and does not
depend on the relative strength of the competitors.
Certainly, our approach can be generalized to account for
strength-dependent upset frequencies [39]. We note that
our single-parameter model fares better when the games
tend to be close to random, and that model estimates for
the upset frequency have larger discrepancies with the
empirical data when the games become more predictable.
Clearly, a more sophisticated set of competition rules is
required when the competitors are very close in strength,
as is the case, for example, in chess [40].